Structural analysis of chat messages for topic detection

Authors

  • Haichao Dong
  • Siu Cheung Hui
  • Yulan He
Abstract

Purpose: This paper studies the characteristics of chat messages by analyzing a collection of 33,121 sample messages gathered from 1,700 conversation sessions between 72 pairs of MSN Messenger users over a four-month period from June to September 2005. The primary objective of chat message characterization is to understand the properties of chat messages for effective message analysis, such as message topic detection.

Methodology/Approach: Based on the study of chat message characteristics, an indicative term-based categorization approach for chat topic detection is proposed. The approach incorporates techniques such as sessionalization of chat messages and extraction of features from icon texts and URLs for message pre-processing. Naïve Bayes, Associative Classification, and Support Vector Machine are employed as classifiers for categorizing topics from chat sessions.

Findings: The indicative term-based approach is superior to the traditional document frequency-based approach for feature selection in chat topic categorization.

Originality/Value: This paper studies the characteristics of chat messages and proposes an indicative term-based categorization approach for chat topic detection. The proposed approach has been incorporated into an instant message analysis system for both online and offline chat topic detection.

Preprint submitted to Online Information Review, 2 May 2006
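As a rough illustration of the pipeline the abstract outlines, the sketch below groups timestamped messages into sessions and classifies each session with a small multinomial Naïve Bayes model restricted to a fixed vocabulary of indicative terms. This is not the authors' implementation. The abstract does not say how sessions are delimited or how indicative terms are chosen, so the 300-second gap threshold, the hand-picked `INDICATIVE_TERMS` set, the sample log, and all function and class names are assumptions made only for this example.

```python
import math
from collections import Counter, defaultdict


def sessionalize(messages, max_gap=300):
    """Group (timestamp_seconds, text) pairs into session strings.

    A new session starts when the gap to the previous message exceeds
    max_gap seconds. The threshold is an assumption for illustration.
    """
    sessions = []
    last_ts = None
    for ts, text in sorted(messages):
        if last_ts is None or ts - last_ts > max_gap:
            sessions.append([])
        sessions[-1].append(text)
        last_ts = ts
    return [" ".join(s) for s in sessions]


def tokenize(text, vocabulary):
    """Lowercase, split on whitespace, keep only vocabulary terms."""
    return [t for t in text.lower().split() if t in vocabulary]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over a fixed vocabulary."""

    def __init__(self, vocabulary):
        self.vocabulary = set(vocabulary)
        self.class_counts = Counter()
        self.term_counts = defaultdict(Counter)

    def fit(self, sessions, labels):
        for text, label in zip(sessions, labels):
            self.class_counts[label] += 1
            self.term_counts[label].update(tokenize(text, self.vocabulary))
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + vocab_size
            score = math.log(n / total)  # log prior
            for term in tokenize(text, self.vocabulary):
                score += math.log((counts[term] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical indicative terms and chat log, invented for this example.
INDICATIVE_TERMS = {"goal", "match", "team", "exam", "lecture", "homework"}
log = [
    (0, "did you watch the match"),
    (40, "great goal by our team"),
    (2000, "exam tomorrow"),
    (2030, "skipped the lecture again"),
]
sessions = sessionalize(log)
model = NaiveBayes(INDICATIVE_TERMS).fit(sessions, ["sports", "study"])
print(model.predict("homework before the exam"))
```

Restricting the vocabulary to a small set of topic-bearing terms is what separates this setup from document frequency-based selection, which would keep terms simply because they occur in many sessions.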

Similar resources

Topic Modeling for Answers Detection in Online Game Chats

Helping behavior is a significant part of social learning process in online games. One type of such a behavior is answering questions in a chat. We provide a method to detect if the question asked in a chat was answered and by whom. Proposed method is based on topic modeling for chat messages and comparison of a detected topic of question with a topic of possible reply. We show its efficiency o...

Automated Chat Thread Analysis : Untangling the Web

As networked digital communications proliferate in military operational command and control (C2), chat messaging is emerging as a preferred communications method for team coordination. Chat room logs provide a potentially rich source of data for analysis in after-action reviews, affording considerable insight into the decision-making processes among the training audience. The multitasking natur...

Detection of Topic Change in IRC Chat Logs

We attack the problem of topic segmentation in the domain of Internet Relay Chat logs. In this process, we examine the previous work in text segmentation using a variety of methods. After considering the pros and cons of the methods, we employ Text Tiling, pause detection, and latent semantic analysis because they did not require the usage of large pre-tagged corpora. With these systems in plac...

Traffic Scene Analysis using Hierarchical Sparse Topical Coding

Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...

Damage detection of structures using modal strain energy with Guyan reduction method

The subject of structural health monitoring and damage identification of structures at the earliest possible stage has been a noteworthy topic for researchers in the last years. Modal strain energy (MSE) based index is one of the efficient methods which are commonly used for detecting damage in structures. It is also more effective and economical to employ some methods for reducing the degrees ...


Journal:
  • Online Information Review

Volume 30

Publication date: 2006